ABSTRACT

This paper analyzes estimates operating on dependent data. Special dependent
data structures are considered, and the analysis is carried out for three
different choices of contamination and performance distance measures.

When both the contamination and the performance measures are Levy (Hampel
model), the analysis is oriented toward fast convergence of the estimate to a
value that is stable (robust) inside the contaminated family. The minimum
sample size needed to satisfy a given performance level is investigated, and it
is found that the problem reduces to finding continuous, absolutely bounded
estimates whose moment generating functions have logarithms that increase
slowly with the absolute value of the argument, for all data distributions
considered.

1. INTRODUCTION

Humans have always taken satisfaction in the game of "outsmarting Nature."
This game becomes particularly intriguing when Nature tries to confuse its
human opponents by slightly changing the underlying statistical rules that
specify its behavior. To beat Nature's game, then, the human player must
follow strategies that are not very sensitive to such statistical changes.
The design of such variation-resistant strategies varies with the particular
objective of the human player at the time, and the decision rules evolving
from this design are called robust.

Robust decision rules have been studied by several statisticians [1,3,7,10,12]
and engineers [11,13,18] from different points of view. The promising
qualitative analysis of robust estimation first introduced by Hampel [10] has
been used and extended [12,18], but the performance criterion has always been
expressed through the Levy distance of the data as well as the estimator
distributions. Specifically, an estimate $s(X_n)$ using the data $X_n$ has been
called robust (or weakly robust according to [18]) at some distribution $Q^o$
if every distribution $Q$ that is close to $Q^o$ in the Levy distance sense
results in an $s(X_n)$ distribution that is Levy-distance close to the $s(X_n)$
distribution resulting from $Q^o$-distributed data. This definition of
robustness, while leading to constructive analysis of robust estimators
[10,18], has two disadvantages: it does not offer convergence rates of the
estimates for large numbers of data, and it may be too demanding or even not
representative enough for some applications. In addition, the constructive
analysis of "Levy robust" estimators has been accomplished only for independent
$X_n$ components.

In particular, one of the properties that characterize an estimate $s(X_n)$
that is "Levy robust" at some one-dimensional distribution $F_o$ and is applied
on independent data is continuity at $F_o$ [10,18]. This property, which
actually means closeness of the values $s(X_n)$, $s(X_m)$ for vectors $X_n$,
$X_m$ that specify experimental distributions that are both close to $F_o$,
guarantees convergence of $s(X_n)$ to some constant depending on $F_o$, but it
does not specify the convergence rate.


Also, in some applications the preservation of some specific characteristic,
less general than the Levy distance, may be desirable. In this case the
performance criterion is different and so are the desirable properties of the
estimator. Performance criteria that are easier to calculate than the Levy
distance may be preferred.

Finally,  the  design  of  estimators  that  are  "robust"  in  the  presence  of 
dependent  data  structures  is  certainly  a problem  that  is  challenging  as  well 
as  realistically  interesting. 

In section 2 of the present paper some preliminary discussion is presented on
previous analysis of robustness, on statistically contaminated distribution
spaces, and on possible different performance criteria.

In section 3, the design of "Vasershtein robust" estimators is undertaken.
The dependence structure of the data is naturally incorporated in this case
if the distortion or penalty measure is the square difference.

In section 4, properties of estimators that are robust as mappings from a
Levy-contaminated data space to an estimate space characterized by a
Vasershtein performance criterion are discussed.

In section 5, a design method of "Levy robust estimates" is presented that
incorporates an exponential convergence rate. A special dependence structure
of the data is considered.

Section 6 includes examples of estimators that are "robust" in the senses of
sections 3, 4 and 5.

2. PRELIMINARIES

Robustness has lately been defined [10,18] as stability of some stochastic
distance measure defined on the estimator probability space. Specifically, let
$X_n$ denote the vector of n discrete data; let $Q^o$ be some well-known
multidimensional cumulative distribution applied on $X_n$ (where $Q^o_n$ is
the n-dimensional distribution evolving from $Q^o$); let $Q$ be some arbitrary
cumulative distribution on $X_n$, n = 1,2,...; let $d_1(\cdot,\cdot)$ be a
stochastic distance measure defined on the data distribution space; let
$s_n(X_n)$ be a scalar estimate that is a deterministic function of the data
$X_n$; let $D(s_n)$, $D^o(s_n)$ be the distributions of $s_n(X_n)$ determined
through $Q$, $Q^o$ respectively; and let $d_{2n}(\cdot,\cdot)$ be a stochastic
distance measure defined on the distribution space $D(s_n)$. Then the sequence
$\{s_n\}$ is weakly robust at $Q^o$ (as defined in [18]) if, given
$\varepsilon > 0$, there is some $\delta(\varepsilon) > 0$ such that for every
$Q$ satisfying $d_1(Q^o,Q) < \delta(\varepsilon)$,

$$ d_{2n}(D^o(s_n),D(s_n)) < \varepsilon;\ \forall n $$

is implied.

It is obvious from the above definition of weak robustness that the stochastic
distances $d_1(\cdot,\cdot)$, $d_{2n}(\cdot,\cdot)$ must be chosen carefully to
satisfy the designer's specific objective. In particular, the distance
$d_1(\cdot,\cdot)$ represents the kind of $Q^o$ statistical contamination
considered, while $d_{2n}(\cdot,\cdot)$ is the performance measure of the
estimate.

In the choice of $d_1(\cdot,\cdot)$ and $d_{2n}(\cdot,\cdot)$, both good
representation of the particular model and the calculability of the distances
must be taken into consideration.

In the study presented in [17], it became apparent that among the plethora of
existing stochastic distances, some are more approachable than others.
Specifically, the Levy distance, although useful in robust analysis, is often
very hard to calculate. On the other hand, the Vasershtein distance is simpler
in some cases and, if it is also representative of the model considered, it
becomes an excellent contamination or performance measure choice.

In the present study, only Levy and Vasershtein distance choices will be
considered. Therefore, their definitions and some of their properties that
are related to the analysis in this paper are presented.

Both Levy and Vasershtein distances include a distortion or penalty measure
$\rho(\cdot,\cdot)$ applied on the outcomes of the distributions involved.
Specifically, if we concentrate our attention on discrete data structures, let
$X_n$, $Y_n$ be two different n-data values and let $\rho_n(X_n,Y_n)$ be their
relative distortion. Also, let $Q^o_n$, $Q_n$ be two different cumulative
distributions of $X_n$. Then the Levy distance $d_{L_\rho}(Q^o_n,Q_n)$ and the
Vasershtein distance $d_{V_\rho}(Q^o_n,Q_n)$ are defined as follows:


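$$ d_{L_\rho}(Q^o_n,Q_n) = \inf\{\varepsilon :\ Q^o_n(X_n) \le Q_n(\cup Y_n : \rho_n(X_n,Y_n) \le \varepsilon) + \varepsilon,\ \ Q_n(X_n) \le Q^o_n(\cup Y_n : \rho_n(X_n,Y_n) \le \varepsilon) + \varepsilon;\ \forall X_n\} \tag{1} $$

$$ d_{V_\rho}(Q^o_n,Q_n) = \inf_{\Gamma_n} E_{\Gamma_n(\cdot,\cdot)}\{\rho_n(X_n,Y_n)\} \tag{2} $$

where the infimum in (2) is taken over all joint distributions
$\Gamma_n(\cdot,\cdot)$ inducing $Q^o_n$, $Q_n$ marginals.

The same distances can be defined for general multivariate distributions
$Q^o$, $Q$, inducing $Q^o_n$, $Q_n$ $\forall n$, in the following way:

$$ d_{L_\rho}(Q^o,Q) = \sup_n d_{L_\rho}(Q^o_n,Q_n) \tag{3} $$

$$ d_{V_\rho}(Q^o,Q) = \sup_n d_{V_\rho}(Q^o_n,Q_n) \tag{4} $$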
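To give a feeling for the computational effort behind a Levy-type distance, the following minimal sketch evaluates the classical one-dimensional Levy distance between two distribution functions by bisection. It is only a stand-in under stated assumptions: it uses the classical scalar form inf{eps : F(x-eps)-eps <= G(x) <= F(x+eps)+eps for all x} rather than the distortion-based definition given above, it checks the "for all x" condition on a finite grid only, and the names `normal_cdf` and `levy_distance_1d` are hypothetical.

```python
import math
import numpy as np

def normal_cdf(x, mean=0.0, std=1.0):
    """Gaussian distribution function, used here only as example input."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))

def levy_distance_1d(F, G, grid, tol=1e-6):
    """Classical scalar Levy distance, with the 'for all x' check
    restricted to the supplied grid (an approximation)."""
    def admissible(eps):
        return all(F(x - eps) - eps <= G(x) <= F(x + eps) + eps for x in grid)

    lo, hi = 0.0, 1.0  # eps = 1 is always admissible for distribution functions
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if admissible(mid):
            hi = mid   # admissibility is monotone in eps, so shrink from above
        else:
            lo = mid
    return hi

grid = np.linspace(-8.0, 8.0, 1601)
F = normal_cdf
G = lambda x: normal_cdf(x, mean=0.5)

levy_same = levy_distance_1d(F, F, grid)    # identical distributions
levy_shift = levy_distance_1d(F, G, grid)   # mean shifted by 0.5
kolmogorov = max(abs(F(x) - G(x)) for x in grid)
```

Even in this scalar case the distance is obtained only through a search over eps, whereas the uniform (Kolmogorov) distance, which always bounds the Levy distance from above, is a single maximization.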

In [18] it was shown that $d_{L_\rho}(Q^o_n,Q_n)$ is nondecreasing with
increasing n for $Q^o_n$, $Q_n$ being both products of one-dimensional
distributions. It is straightforward to show that the same is true for
arbitrary $Q^o$, $Q$. The Vasershtein distance $d_{V_\rho}(Q^o_n,Q_n)$, on the
other hand, can be independent of n for stationary data structures and proper
choice of the distortion (penalty) measure $\rho(\cdot,\cdot)$. Indeed, let
$X_n = (x_i;\ i = 1,\dots,n)$ be a sample vector from a wide-sense stationary
process x(t) whose autocovariance function is $R^o(\tau)$ and whose
n-dimensional discrete distribution is represented by $Q^o_n$. Also, let $Y_n$
be the discrete representation of another wide-sense stationary process y(t)
with autocovariance function $R(\tau)$. If the Vasershtein distance is defined
in the space of jointly stationary distributions and if
$\rho_n(X_n,Y_n) = \frac{1}{n}[X_n - Y_n]'[X_n - Y_n]$, where $[\ ]'$ means
transpose, then, as was found in [15] and [17], one obtains

$$ d_{V_\rho}(Q^o_n,Q_n) = \inf_{R^c(0)}\{R^o(0) + R(0) - 2R^c(0) + (m^o - m)^2\} \tag{5} $$

where $R^c(\tau)$ is the crosscovariance function of x(t), y(t), $m^o$ is the
mean of x(t), and $m$ the mean of y(t). The expression in (5) is obviously
independent of n. Furthermore, if

$$ P^o(\lambda) = \sum_{k=-\infty}^{\infty} R^o(k)e^{jk\lambda} \ \Rightarrow\ R^o(0) = (2\pi)^{-1}\int_{-\pi}^{\pi} P^o(\lambda)\,d\lambda \tag{6} $$

$$ P(\lambda) = \sum_{k=-\infty}^{\infty} R(k)e^{jk\lambda} \ \Rightarrow\ R(0) = (2\pi)^{-1}\int_{-\pi}^{\pi} P(\lambda)\,d\lambda \tag{7} $$

and if $Q^o_n$, $Q_n$ are Gaussian, then the $R^c(0)$ that satisfies the
infimum in (5) was found in [15] to be given by the expression:

$$ R^c(0) = (2\pi)^{-1}\int_{-\pi}^{\pi} \sqrt{P^o(\lambda)P(\lambda)}\,d\lambda \tag{8} $$

which provides the following expression of the Vasershtein distance for
stationary Gaussian distribution spaces:

$$ d_{V_\rho}(Q^o_n,Q_n) = d_{V_\rho}(Q^o,Q) = (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 d\lambda + (m^o - m)^2 \tag{9} $$

where

$$ \rho_n(X_n,Y_n) = \frac{1}{n}[X_n - Y_n]'[X_n - Y_n] \tag{10} $$

If $Q^o_n$, $Q_n$ are not Gaussian, the expression on the right-hand side of
(9) becomes a lower bound on $d_{V_\rho}(Q^o_n,Q_n)$.

The  expression  in  (9)  is  simple,  computable  analytically  in  most  cases, 
and  independent  of  the  sample  size  n . It  is  also  significant  to  observe 
from  (9)  that  the  wide  sense  stationary  Gaussian  distributions  that  are  "close" 
with  respect  to  a square  error,  are  the  ones  whose  means  and  discrete  spectra 
functions  are  "close." 
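The closed form in (9) can be evaluated numerically with a few lines of code. The sketch below is illustrative only: the function names, the uniform frequency grid, and the first-order autoregressive example spectra are assumptions introduced here, not part of the original analysis.

```python
import numpy as np

def vasershtein_gaussian(P0, P, m0, m, num=4096):
    """Evaluate expression (9) on a uniform grid over [-pi, pi).

    P0 and P are callables returning the spectral densities; m0 and m
    are the process means. The grid mean approximates (2*pi)^-1 times
    the integral over [-pi, pi].
    """
    lam = np.linspace(-np.pi, np.pi, num, endpoint=False)
    spectral_term = np.mean((np.sqrt(P0(lam)) - np.sqrt(P(lam))) ** 2)
    return float(spectral_term + (m0 - m) ** 2)

def ar1_spectrum(phi, sigma2=1.0):
    """Spectrum sum_k R(k) e^{jk lam} of x_t = phi*x_{t-1} + w_t, Var(w_t) = sigma2."""
    return lambda lam: sigma2 / (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2)

# White sequences with variances 4 and 1 and means 1 and 0:
# (sqrt(4) - sqrt(1))^2 + (1 - 0)^2 = 2.
white_a = lambda lam: np.full_like(lam, 4.0)
white_b = lambda lam: np.full_like(lam, 1.0)
d_white = vasershtein_gaussian(white_a, white_b, m0=1.0, m=0.0)

# Two zero-mean autoregressive processes with nearby coefficients.
d_ar = vasershtein_gaussian(ar1_spectrum(0.5), ar1_spectrum(0.6), 0.0, 0.0)
```

For the white-noise pair the spectral term reduces to the squared difference of the standard deviations, which makes the result easy to verify by hand; note that the sample size n never enters the computation.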
The constructive analysis of "robust" estimators presented in [10] and [18],
in addition to being restricted to "Levy robustness," was also limited to the
case of independent data structures. It was found there that the sequence
$\{s_n(X_n)\}$ of estimates is weakly robust at some $Q_{o1}$, in the sense of
both the $d_1(\cdot,\cdot)$, $d_{2n}(\cdot,\cdot)$ distances being Levy, if:
a) $s_n(X_n)$ is continuous for every n as a real function with $E^n$
Euclidean domain; b) $s_n(X_n)$ is continuous at $Q_{o1}$, this continuity
meaning that for every $X_n$, $X_m$ that determine experimental distributions
that are both Levy close to $Q_{o1}$, it is implied that
$|s_n(X_n) - s_m(X_m)|$ is small.

The above analysis does not provide convergence rates of the estimate
$s_n(X_n)$. Such a rate is important to the designer who is looking not only
for "robust" estimates but also for sufficient sample sizes. It is important
to have an analysis that answers the double question: "What kind of estimate
will be robust for a given contaminated family, and how many data are
sufficient to guarantee a certain minimum level of performance inside the
same family?"

An attempt to answer the above question for certain dependent data structures
is even more valuable. Having the dependent data situations in mind, we will
first study estimates that are robust for contamination and performance
measures that are not both Levy. Then we will present an analysis that,
although applied to Levy distribution contamination and Levy performance
measure, differs from the classical one in [10] and [18] in that it can
apply to dependent data and it incorporates the convergence rate of the
estimates.


3.  VASERSHTEIN  ROBUST  ESTIMATORS 

In this section we will assume that the human player has the information that
Nature uses a Vasershtein algorithm to contaminate its underlying statistics.
That is, we consider the measure of contamination to be the Vasershtein
distance and we pick as distortion measure the square error one. If, in
addition, Nature picks its statistics from the wide-sense stationary
distribution space $\mathcal{F}$ that surrounds a well-known distribution
$Q^o$, then the Vasershtein distance between $Q^o$ and an arbitrary member
$Q \in \mathcal{F}$ satisfies (as expressed in (9) of section 2):

$$ d_{V_\rho}(Q^o,Q) \ge (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 d\lambda + (m^o - m)^2 \tag{11} $$

where

$$ \rho_n(X_n,Y_n) = \frac{1}{n}[X_n - Y_n]'[X_n - Y_n] \tag{12} $$

$m^o$, $m$ are the one-dimensional means corresponding to $Q^o$ and $Q$, and
$P^o(\lambda)$, $P(\lambda)$ are the respective discrete spectral densities.
We have equality in (11) if the $Q^o_n$, $Q_n$ distributions are both Gaussian.

From (11) it is apparent that the Vasershtein $\rho$-contaminated Gaussian
distribution families are the families with contaminated means and spectral
densities.

For consideration of arbitrary $Q$ distributions (even when $Q^o$ is
Gaussian) we can transform the contamination measure to:

$$ d_{P,m}(Q^o,Q) = (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 d\lambda + (m^o - m)^2 \tag{13} $$

In other words, we suppose that Nature contaminates the data statistics by
contaminating the spectral density and the mean of the underlying stationary
process.

Let us now suppose that the observer's performance measure is the mean-square
one. That is, if p parameters must be estimated from the collected data
$X_n$, then the robustness of the p-dimensional estimate $S_n(X_n)$ is
evaluated through a mean square error value. In other words, the estimate
designer is fully satisfied if the average mean square distortion between the
value $S_n(X_n)$ he calculates when Nature uses $Q^o$ underlying statistics
and the value $S_n(Y_n)$ he finds when $Q$ is true remains small when $Q^o$,
$Q$ are close enough in the Vasershtein distance sense.

If $D^o(S_n)$, $D(S_n)$ are the distributions of $S_n(X_n)$ evolving from
$Q^o$, $Q$ respectively, then

$$ d_{V_\rho}(D^o(S_n),D(S_n)) = p^{-1}\sum_{i=1}^{p}\ \inf_{\Gamma_{ni}(\cdot,\cdot)\ \text{inducing}\ D^o(s_{in}),\,D(s_{in})} E_{\Gamma_{ni}}\{[s_{in}(X_n) - s_{in}(Y_n)]^2\} \tag{14} $$

where

$$ S_n(X_n) = (s_{in}(X_n);\ i = 1,\dots,p) \tag{15} $$

Let

$$ E_{D^o(s_{in})}\{s_{in}(X_n)\} = m^o_{in};\quad E_{D(s_{in})}\{s_{in}(X_n)\} = m_{in} \tag{16} $$

$$ E_{D^o(s_{in})}\{[s_{in}(X_n) - m^o_{in}]^2\} = \sigma^o_{in}(0);\quad E_{D(s_{in})}\{[s_{in}(X_n) - m_{in}]^2\} = \sigma_{in}(0) \tag{17} $$

Then the following lemma can be expressed:

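Lemma 1

The distance in (14) is bounded from below by the expression

$$ p^{-1}\sum_{i=1}^{p}\left\{\left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2\right\} \tag{18} $$

Proof:

For any joint distribution $\Gamma_{ni}(\cdot,\cdot)$ with $D^o(s_{in})$,
$D(s_{in})$ marginals, if $\sigma^c_{in}(0)$ is the crosscovariance determined
by $\Gamma_{ni}(\cdot,\cdot)$, then the matrix

$$ \begin{bmatrix} \sigma^o_{in}(0) & \sigma^c_{in}(0) \\ \sigma^c_{in}(0) & \sigma_{in}(0) \end{bmatrix} $$

must be nonnegative definite. Therefore,

$$ \sigma^o_{in}(0)\,\sigma_{in}(0) \ge [\sigma^c_{in}(0)]^2 $$

Then,

$$ E_{\Gamma_{ni}(\cdot,\cdot)}\{[s_{in}(X_n) - s_{in}(Y_n)]^2\} = \sigma^o_{in}(0) + \sigma_{in}(0) - 2\sigma^c_{in}(0) + (m^o_{in} - m_{in})^2 $$

$$ \ge \sigma^o_{in}(0) + \sigma_{in}(0) - 2\sqrt{\sigma^o_{in}(0)\,\sigma_{in}(0)} + (m^o_{in} - m_{in})^2 = \left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2 $$

and

$$ d_{V_\rho}(D^o(S_n),D(S_n)) \ge p^{-1}\sum_{i=1}^{p}\left\{\left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2\right\} $$

The expression in (18) is equal to the distance in (14) if and only if
$D(s_{in})$ can be the distribution of a linear transformation of the variable
distributed as in $D^o(s_{in})$, for every $1 \le i \le p$.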
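The bound of lemma 1 and its equality condition can be checked on simulated scalar estimates (p = 1). The sketch below rests on one added fact: for two empirical distributions of equal size on the real line and square-error distortion, the optimal coupling pairs the order statistics. The sample sizes, the distributions, and the function names are assumptions made for illustration.

```python
import numpy as np

def w2_squared_empirical(x, y):
    """Square-error Vasershtein distance between two equal-size empirical
    distributions on the line, obtained by pairing order statistics."""
    xs, ys = np.sort(x), np.sort(y)
    return float(np.mean((xs - ys) ** 2))

def lemma1_lower_bound(x, y):
    """Expression (18) for a single component, using empirical moments."""
    return float((np.std(x) - np.std(y)) ** 2 + (np.mean(x) - np.mean(y)) ** 2)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=5000)

# Case 1: y is an increasing linear transformation of x, so equality is expected.
y_linear = 2.0 * x + 1.0
gap_linear = w2_squared_empirical(x, y_linear) - lemma1_lower_bound(x, y_linear)

# Case 2: y is exponential, not a linear image of x, so a strictly positive gap is expected.
y_other = rng.exponential(1.0, size=5000)
gap_other = w2_squared_empirical(x, y_other) - lemma1_lower_bound(x, y_other)
```

The gap vanishes (up to rounding) in the linear case and is clearly positive in the second case, in line with the equality condition stated at the end of the proof.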

We summarize the observations made up to now in the following definition:


Definition 1

We will call a sequence $\{S_n\}$ of p-dimensional estimates
$\rho$-Vasershtein weakly robust inside a wide-sense stationary distribution
family $\mathcal{F}$ and at some $Q^o$ if and only if, given
$\varepsilon > 0$, there is some $\delta(\varepsilon) > 0$ such that: for
$P^o(\lambda)$, $m^o$ being the spectral density and the mean induced by
$Q^o$, $P(\lambda)$, $m$ being the spectral density and mean of some
$Q \in \mathcal{F}$, and for

$$ (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 d\lambda + (m^o - m)^2 < \delta(\varepsilon) $$

it is implied that:

$$ p^{-1}\sum_{i=1}^{p}\left\{\left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2\right\} < \varepsilon;\ \forall n $$

where $\sigma^o_{in}(0)$, $\sigma_{in}(0)$, $m^o_{in}$, $m_{in}$ are given by
(17) and (16) respectively.
Observations

1. If the $\mathcal{F}$ family in definition 1 is a Gaussian stationary family
and the estimates $s_{in}(X_n)$ are linear transformations of the data, then
the expressions

$$ (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 d\lambda + (m^o - m)^2 $$

$$ p^{-1}\sum_{i=1}^{p}\left\{\left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2\right\} $$

are the exact Vasershtein distances of the distributions $Q^o$, $Q$ and
$D^o(s_{in})$, $D(s_{in})$, respectively.

2. It is evident from definition 1 that since we want the closeness of the
estimate means and variances guaranteed by the closeness of just the spectral
densities and the means of the data distributions, we must limit the estimates
to linear transformations of the observations.

Let us define the linear estimates

$$ s_{in}(X_n) = \sum_{k=1}^{n} a_{ni}(k)\,x_k \tag{19} $$

where $X_n = (x_k;\ k = 1,\dots,n)$ and the coefficients $a_{ni}(k)$ are real,
scalar, and, in general, different for different dimensionality n. To study
robustness, we need the means and variances of the estimate in (19) under data
distributions $Q^o$ and $Q$. Indeed, we have

$$ m^o_{in} = E\{s_{in}(X_n) \mid Q^o\} = m^o\sum_{k=1}^{n} a_{ni}(k) $$

$$ m_{in} = E\{s_{in}(X_n) \mid Q\} = m\sum_{k=1}^{n} a_{ni}(k) \tag{20} $$

$$ \sigma^o_{in}(0) = \sum_{k,l=1}^{n} a_{ni}(k)\,a_{ni}(l)\,R^o(k-l) = (2\pi)^{-1}\int_{-\pi}^{\pi} P^o(\lambda)\sum_{k,l=1}^{n} a_{ni}(k)\,a_{ni}(l)\,e^{-j(k-l)\lambda}\,d\lambda $$

where $Q^o$ is wide-sense stationary with autocovariance $R^o(\tau)$ and power
spectral density $P^o(\lambda)$.

From the above expression we finally obtain:

$$ \sigma^o_{in}(0) = (2\pi)^{-1}\int_{-\pi}^{\pi} P^o(\lambda)\left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda \tag{21} $$

and similarly:

$$ \sigma_{in}(0) = (2\pi)^{-1}\int_{-\pi}^{\pi} P(\lambda)\left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda \tag{22} $$

for the variance under distribution $Q$. Applying the Schwarz inequality:

$$ \int_{-\pi}^{\pi} \sqrt{P^o(\lambda)}\sqrt{P(\lambda)}\left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda \le \left[\int_{-\pi}^{\pi} P^o(\lambda)\left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda\right]^{1/2} \left[\int_{-\pi}^{\pi} P(\lambda)\left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda\right]^{1/2} $$

we obtain from (21) and (22):

$$ \left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 \le (2\pi)^{-1}\int_{-\pi}^{\pi} \left(\sqrt{P^o(\lambda)} - \sqrt{P(\lambda)}\right)^2 \left|\sum_{k=1}^{n} a_{ni}(k)e^{-jk\lambda}\right|^2 d\lambda \tag{23} $$

while from (20) we get directly:

$$ (m^o_{in} - m_{in})^2 = (m^o - m)^2\left[\sum_{k=1}^{n} a_{ni}(k)\right]^2 \tag{24} $$

The  expressions  in  (23)  and  (24)  express  the  connection  between  data  and 
estimate  statistics  needed  to  specify  p-Vasershtein  weak  robustness,  as  given 
by  definition  1. 
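The relations (23) and (24) can be verified numerically for a concrete linear estimate. The sketch below uses the experimental mean with n = 25 and two first-order autoregressive spectra. The grid size, the spectra, the means, and the helper names are assumptions chosen only for illustration.

```python
import numpy as np

lam = np.linspace(-np.pi, np.pi, 8192, endpoint=False)

def ar1_spectrum(phi, sigma2=1.0):
    """Spectrum sum_k R(k) e^{jk lam} of a first-order autoregressive process."""
    return sigma2 / (1.0 - 2.0 * phi * np.cos(lam) + phi ** 2)

def gain(a):
    """|sum_{k=1}^{n} a(k) e^{-jk lam}|^2 evaluated on the frequency grid."""
    k = np.arange(1, len(a) + 1)
    return np.abs(np.exp(-1j * np.outer(lam, k)) @ a) ** 2

def variance(a, P):
    """Expressions (21)/(22); the grid mean approximates (2*pi)^-1 * integral."""
    return float(np.mean(P * gain(a)))

n = 25
a = np.full(n, 1.0 / n)                 # experimental-mean coefficients
P0, P = ar1_spectrum(0.5), ar1_spectrum(0.7)
m0, m = 1.0, 1.3

var0, var1 = variance(a, P0), variance(a, P)

# Inequality (23): left side versus the spectrally weighted right side.
lhs_23 = (np.sqrt(var0) - np.sqrt(var1)) ** 2
rhs_23 = float(np.mean((np.sqrt(P0) - np.sqrt(P)) ** 2 * gain(a)))

# Identity (24): squared difference of the estimate means.
lhs_24 = (m0 * a.sum() - m * a.sum()) ** 2
rhs_24 = (m0 - m) ** 2 * a.sum() ** 2
```

Since the gain is bounded by the square of the sum of the absolute coefficients, the right side of (23) is controlled by the spectral contamination alone, which is what the sufficient condition of the next lemma exploits.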

From the analysis done above, and the expression of $\rho$-Vasershtein weak
robustness in definition 1, a lemma offering constructive properties of
estimates that are Vasershtein weakly robust is obtained.


Lemma 2

A p-dimensional estimate $S_n(X_n)$ that is $\rho$-Vasershtein weakly robust,
as expressed by definition 1, must be linear. If this linear estimate is given
by the expressions:

$$ S_n(X_n) = (s_{in}(X_n);\ 1 \le i \le p),\quad s_{in}(X_n) = \sum_{k=1}^{n} a_{ni}(k)\,x_k $$

a sufficient condition for the present sense of robustness is that the
sequences

$$ \left\{\sum_{k=1}^{n} |a_{ni}(k)|\right\} $$

converge to some finite value for every $1 \le i \le p$.

Proof:

Let the sequence $\{\sum_{k=1}^{n} |a_{ni}(k)|\}$ converge to some
$A_i < \infty$. Then, also

$$ \left(\sum_{k=1}^{n} a_{ni}(k)\right)^2 \le \left(\sum_{k=1}^{n} |a_{ni}(k)|\right)^2 \le A_i^2 \le \max_{1 \le i \le p} A_i^2;\ \forall n $$

Given $\varepsilon > 0$, pick

$$ \delta(\varepsilon) = \frac{\varepsilon}{2p \cdot \max_{1 \le i \le p} A_i^2} $$

and the conditions in definition 1 are satisfied.

According to lemma 2, the experimental mean
$s_{in}(X_n) = \frac{1}{n}\sum_{i=1}^{n} x_i$ is a $\rho$-Vasershtein weakly
robust estimate. The condition of lemma 2 can be easily seen to satisfy
robustness (not just weak robustness) properties inside some
data-distribution contaminated family.
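The sufficient condition of lemma 2 concerns only the absolute sums of the coefficients, so it can be inspected directly. In the sketch below the experimental mean is compared with a plain (unnormalized) sum; the second estimate and all function names are hypothetical and serve only as a contrast.

```python
import numpy as np

def abs_coefficient_sum(coeffs):
    """Sum of |a_n(k)| over k, the quantity that lemma 2 requires to converge."""
    return float(np.sum(np.abs(coeffs)))

def sample_mean_coeffs(n):
    """Coefficients a_n(k) = 1/n of the experimental mean."""
    return np.full(n, 1.0 / n)

def plain_sum_coeffs(n):
    """Coefficients a_n(k) = 1 of an unnormalized sum (contrast case)."""
    return np.ones(n)

sizes = [10, 100, 1000]
mean_sums = [abs_coefficient_sum(sample_mean_coeffs(n)) for n in sizes]
sum_sums = [abs_coefficient_sum(plain_sum_coeffs(n)) for n in sizes]
```

For the experimental mean the absolute sum stays at one for every n, so the condition holds; for the unnormalized sum it grows linearly with n and the sufficient condition fails.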

Concluding this section, we want to emphasize that the robustness structure
presented here considers dependent data with dependence expressed by arbitrary
wide-sense stationary distributions. This dependence was explicitly
incorporated in the robustness only through the spectral densities of these
distributions.

4.  LEVY-VASERSHTEIN  ROBUST  ESTIMATORS 

In this section we consider the case in which the contamination measure on
the data distribution space is the Levy distance, while the performance
measure on the space of the estimates is the Vasershtein-type distance

$$ d_{V_t}(D^o(S_n),D(S_n)) = p^{-1}\sum_{i=1}^{p}\left\{\left[\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)}\right]^2 + (m^o_{in} - m_{in})^2\right\} \tag{25} $$

The characteristics $\sigma^o_{in}(0)$, $\sigma_{in}(0)$, $m^o_{in}$,
$m_{in}$ are given by expressions (16) and (17) of the previous section, and
the distance in (25) is again equal to the Vasershtein distance if the data
distribution family is a wide-sense stationary family and the estimates are
linear.

The robustness considered here is precisely expressed by the following
definition.

Definition 2

A sequence $\{S_n(X_n)\}$ of p-dimensional estimates is weakly
$\rho$-Levy-Vasershtein robust at some distribution $Q^o$ if and only if:
given $\varepsilon > 0$, there is some $\delta(\varepsilon) > 0$ such that for
every distribution $Q$ satisfying:

$$ d_{L_\rho}(Q^o,Q) < \delta(\varepsilon) $$

it is implied that

$$ d_{V_t}(D^o(S_n),D(S_n)) < \varepsilon;\ \forall n $$

The distance $d_{L_\rho}(Q^o,Q)$ is defined by expressions (1) and (3) and the
distance $d_{V_t}(D^o(S_n),D(S_n))$ by expression (25). The dependence
structure of the data (as expressed by $Q^o$ and $Q$) is arbitrary at this
point.

In the analysis for the discovery of constructive properties of the estimates
that are robust in the $\rho$-Levy-Vasershtein sense, we will need to bound
the absolute values of the $S_n(X_n)$ components $s_{in}(X_n)$;
$i = 1,\dots,p$, for every i and every n. That restriction is very useful in
practice whenever we are seeking the estimation of parameters whose values (we
know in advance) move inside a limited interval. The value restriction on the
estimates rejects a priori the unacceptably (or dangerously) false decisions
on the parameters of interest.

Similarly to the method presented in [18], we will break the constructive
analysis of the $\rho$-Levy-Vasershtein weakly robust estimates into two
parts: one for sample sizes n bounded from above by some $n_o$ and one for the
n's that exceed this bound $n_o$.

We proceed first with the bounded n part, presenting the following lemma.
Lemma 3

Let an estimate $S_n(X_n) = (s_{in}(X_n);\ 1 \le i \le p)$ be absolutely
bounded for every i, n, $X_n$ and let all the $s_{in}(X_n)$ components be
continuous as real functions defined on the $E^n$ Euclidean space
$\forall n$. Then, given some natural number $n_o$ and some
$\varepsilon > 0$, there is some $\delta(\varepsilon,n_o) > 0$ such that
$\forall n \le n_o$ and for $Q_n$:

$$ d_{L_\rho}(Q^o_n,Q_n) < \delta(\varepsilon,n_o) $$

it is implied that:

$$ d_{V_t}(D^o(S_n),D(S_n)) < \varepsilon $$

The proof of this lemma is presented in appendix A. The continuity of
$s_{in}(X_n)$ as a real function is understood as continuity from the
$\rho_n(X_n,Y_n)$ distortion measure on the data, which is incorporated in the
Levy distance, to the absolute value difference $|s_{in}(X_n) - s_{in}(Y_n)|$
of the estimates. No consideration of a particular dependence structure of the
data was necessary at this point.

The lemma we present next combines the properties of the estimator that
satisfy the $\rho$-Levy-Vasershtein weak robustness requirements for finite as
well as infinite sample sizes n.

For the transition to the infinite n step, the specification of a particular
dependence structure of the data is necessary. In particular, we will assume
that the family of data statistics considered is limited to m-dependent
distributions. In addition, the data will be collected in groups of k
consecutive data and the groups will be at a distance of m data from each
other. Specifically, the data vector $X_n$ will consist, in this case, of
k-dimensional vectors $X^k_i$; $i = 1,2,\dots$. The components of each
$X^k_i$ vector are dependent, but $X^k_i$ is independent of $X^k_j$ for
$i \ne j$.

The experimental distribution of the vector $X_n$ is then defined as follows,
with $n_1$ the number of k-dimensional vectors in $X_n$:

$$ \mu_{X_n}(u_1,\dots,u_k) = \frac{1}{n_1}\{\#\ \text{of}\ X^k_i\text{'s with}\ x_1 \le u_1,\ x_2 \le u_2,\ \dots,\ x_k \le u_k\} \tag{26} $$
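The block collection scheme and the experimental distribution in (26) can be sketched as follows. This is an illustration under assumptions: the helper `split_into_blocks` (keep k consecutive samples, skip m, repeat) is one reading of "groups at a distance of m data from each other," the comparisons are taken as non-strict, and both function names are hypothetical.

```python
import numpy as np

def split_into_blocks(x, k, m):
    """Hypothetical helper: keep k consecutive samples, skip the next m, repeat.
    Returns an (n1, k) array holding the n1 complete blocks."""
    x = np.asarray(x, dtype=float)
    step = k + m
    n1 = (len(x) + m) // step          # number of complete k-size blocks
    return np.stack([x[j * step: j * step + k] for j in range(n1)])

def experimental_distribution(blocks, u):
    """Fraction of blocks whose components are all <= the thresholds u, as in (26)."""
    blocks = np.asarray(blocks, dtype=float)
    u = np.asarray(u, dtype=float)
    return float(np.mean(np.all(blocks <= u, axis=1)))

data = np.arange(10)                   # toy data 0, 1, ..., 9
blocks = split_into_blocks(data, k=2, m=1)
value = experimental_distribution(blocks, u=[3.0, 4.0])
```

With k = 2 and m = 1 the toy data yield the blocks (0,1), (3,4), (6,7); two of the three satisfy both thresholds, so the experimental distribution at (3,4) equals 2/3.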

A continuity property of the estimator, indicating stochastic stability around
the central data distribution $Q^o$, is defined below:


Definition 3

For data consisting of independent vectors $X^k_i$ and experimental
distributions defined by (26), an estimator
$S_n(X_n) = (s_{in}(X_n);\ 1 \le i \le p)$ is continuous at $Q^o_k$ if and
only if: given $\varepsilon > 0$, there is some $\delta(\varepsilon) > 0$ and
some $n_o$ such that for every $X_n$, $Y_m$ satisfying

$$ d_{L_\rho}(\mu_{X_n},Q^o_k) < \delta(\varepsilon) $$

$$ d_{L_\rho}(\mu_{Y_m},Q^o_k) < \delta(\varepsilon) $$

and $n, m > n_o$, it is implied:

$$ \max_{1 \le i \le p} |s_{in}(X_n) - s_{im}(Y_m)| < \varepsilon $$


Combination of lemma 3 and definition 3 leads to the following lemma, whose
proof can be found in Appendix A.

Lemma 4

Let $S_n(X_n) = (s_{in}(X_n);\ 1 \le i \le p)$ be absolutely bounded for every
i, n, $X_n$. Let every $s_{in}(X_n)$ be a $\rho$-continuous real function on
$E^n$ $\forall n$ and let $S_n(X_n)$ be continuous at $Q^o_k$, where $X_n$ is
formed from $n_1$ independent k-dimensional data vectors. Then the estimate
$S_n(X_n)$ is $\rho$-Levy-Vasershtein weakly robust.
The $\rho$ included in lemma 4 and definition 2 indicates the distortion
measure included in the Levy distance while, as we showed in the previous
section, the Vasershtein-type performance measure $d_{V_t}(\cdot,\cdot)$ is
the result of considering the square error data measure in the expression of
the Vasershtein distance.

The estimator properties that guarantee weak robustness through the
contamination and performance measures considered in this section are similar
to the ones of the classical Levy-Levy model analyzed in [10] and [18]. Here
the absolute boundedness of the estimate is an additional desirable property.

The estimators that are not weakly robust in the Levy-Levy sense are not
robust in the Levy-Vasershtein sense either.

The means for the design of the properly "robust" estimates are similar in
both of the above cases.

In the following section we present an alternative analysis method that
incorporates convergence rates and gives us a better feeling as to the proper
design methods for Levy-contamination, Levy-performance robust estimates.

5. A NEW APPROACH TO LEVY-LEVY ROBUST ESTIMATION

The robust models that were considered by Hampel [10] and Papantoni-Kazakos
[18] were based on Levy-contaminated data distribution families and a Levy
performance criterion of the estimates. Specifically, if $\rho(X_n,Y_n)$ is
some distortion (penalty) measure defined on the data, robustness is defined
as follows according to this model:

Definition 4

A sequence $\{S_n(X_n)\}$ of p-dimensional estimates is Levy-Levy weakly
robust at $Q^o$ if and only if, given $\varepsilon > 0$, there is some
$\delta(\varepsilon) > 0$ such that: for every $Q$ such that

$$ d_{L_\rho}(Q^o,Q) < \delta(\varepsilon) $$

it is implied that:

$$ d_{L_\rho}(D^o(S_n),D(S_n)) < \varepsilon;\ \forall n $$

$D^o(S_n)$, $D(S_n)$ are the estimate p-dimensional distributions induced by
$Q^o(X_n)$, $Q(X_n)$, respectively. Also, the distributions $Q^o$, $Q$
generate, in general, dependent vectors.

If $n_o$ is some finite natural number, the stability property expressed by
definition 4 is satisfied for $n \le n_o$ if the components $s_{in}(X_n)$ of
the estimate $S_n(X_n)$ are all continuous as functions with $E^n$ Euclidean
domain, where n is any natural number. The proof of this is similar to the
corresponding proof for independent data appearing in [18] and to the proof of
lemma 3 of the present paper, which can be found in appendix A. Formally
speaking:

Lemma 4

If $S_n(X_n)$ is continuous as a function on $E^n$ $\forall n$, then, given
$\varepsilon > 0$ and $n_o$, there is some $\delta(\varepsilon,n_o) > 0$ such
that: for every distribution $Q$ satisfying

$$ d_{L_\rho}(Q^o,Q) < \delta(\varepsilon,n_o) $$

it is implied:

$$ d_{L_\rho}(D^o(S_n),D(S_n)) < \varepsilon;\ \forall n \le n_o $$

For data samples that are unlimited in number, we would like to investigate
properties of the estimates that, in addition to satisfying the conditions in
definition 4, also guarantee fast convergence. Such an analysis will provide
the designer with the additional valuable information of the sample sizes
necessary to satisfy a given performance. The performance measure in this
case is, of course, the Levy deviation of the estimate whenever the data
distributions move inside a certain sphere.

Before we proceed with the analysis, we need the assumption of a certain
dependence structure of the data. As in the previous section, we will assume
that the data vector consists of $n_1$ k-dimensional independent vectors
$X^k_j$, $j = 1,\dots,n_1$; $n = k n_1$.

To avoid unnecessary generality and to make the analysis more meaningful,
we will also consider a particular, quite general form of estimates. Specifically,
we will assume that to estimate the component $s_i$ of the parameter
vector we apply a (in general different) continuous transformation to
each of the first q k-size data blocks and form a linear combination of
these transformations. We repeat the same transformations on the next qk-size
data block, etc., and we finally average the resulting values. The assumption
is, of course, that we always receive data in qk-size blocks. To express
the above description mathematically, we write:

    $\hat{s}_{i,nqk}(X_{nqk}) = \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell}\, \mu_{i\ell}(X_{k,j\ell})$    (27)

where $X_{k,j\ell}$ is the $j\ell$-th k-size block of data from the vector $X_{nqk}$, and
$\mu_{i\ell}(\cdot)$ is a continuous scalar function on the Euclidean space $E^k$ for every
$1 \le \ell \le q$. The continuity of $\mu_{i\ell}(\cdot)$ guarantees satisfaction of lemma 4.
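
The block-averaged estimate described above can be sketched in a few lines. This is only an illustration under stated assumptions: the function name `block_average_estimate`, the argument layout, and the clipped block mean used as the transformation are hypothetical and are not taken from the paper. The sketch assumes that the data arrive as n groups of q consecutive k-size blocks and that the final normalization is an average over the n groups.

```python
def block_average_estimate(data, k, coeffs, transforms):
    """Average, over groups of q k-size blocks, of a linear combination
    of one scalar transformation per block (hypothetical helper)."""
    q = len(coeffs)
    group_len = q * k
    if len(data) == 0 or len(data) % group_len != 0:
        raise ValueError("data must arrive in complete qk-size groups")
    n = len(data) // group_len            # number of qk-size groups
    total = 0.0
    for j in range(n):
        base = j * group_len
        for l in range(q):
            block = data[base + l * k: base + (l + 1) * k]
            total += coeffs[l] * transforms[l](block)
    return total / n


def clipped_mean(block, bound=5.0):
    """Example transformation: block mean clipped to [-bound, bound],
    which is continuous and absolutely bounded."""
    m = sum(block) / len(block)
    return max(-bound, min(bound, m))


estimate = block_average_estimate(
    [1, 2, 3, 4, 5, 6, 7, 8], k=2,
    coeffs=[0.5, 0.5], transforms=[clipped_mean, clipped_mean])
```

The clipped mean is chosen here only because it is both continuous and absolutely bounded, the two properties the analysis later requires of the transformations.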

We consider estimators of the same general nature for all the components;
therefore we finally obtain a system of estimates as given by (27) for
$1 \le i \le p$. We observe at this point that since the k-dimensional data
vectors $X_{k,j\ell}$ have been assumed to be independent of each other, the functions
$\mu_{i\ell}(X_{k,j\ell})$ are independent random variables for different $j\ell$'s.

For convenience, we will pick here the distortion measure $\rho$ that is
included in the Levy distances of definition 4 to be given by:

    $\rho_o(X_n, Y_n) = n^{-1} \sum_{i=1}^{n} |x_i - y_i|$ ; $X_n = \{x_i ; i = 1, \ldots, n\}$ ; $Y_n = \{y_i ; i = 1, \ldots, n\}$    (28)

Some different distortion (penalty) measures $\rho(\cdot,\cdot)$ lead to an analysis similar
to the one that will be presented in this section; therefore the choice in
(28) is not truly restrictive.

Starting on the analysis of the estimators in (27), let us suppose that
for some $\epsilon > 0$, some $n$, and some $Q^o$, $Q^1$ distributions we have:

    $d_{L_{\rho_o}}(D^o(\hat{S}_{nqk}), D^1(\hat{S}_{nqk})) < \epsilon$    (29)


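According to Strassen ([2], Th. 11) the condition (29) is true if and only
if there is some 2p-dimensional distribution $D^{o1}(\cdot,\cdot)$ with $D^o(\cdot)$, $D^1(\cdot)$
marginals such that

    $D^{o1}\Big( \frac{1}{p} \sum_{i=1}^{p} \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} [\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})] \Big| > \epsilon \Big) < \epsilon$    (30)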


In (30), the estimate form in (27) is considered and $X_{k,j\ell}$, $Y_{k,j\ell}$ are
distributed as in $Q^o$, $Q^1$ respectively.

The joint distribution $D^{o1}$ can be translated to a distribution $Q^{o1}$
with $Q^o$, $Q^1$ marginals instead through a specific estimate choice. Since
the $Q^o$, $Q^1$ distributions are representing data of k-size independent blocks,
$Q^{o1}$ and $D^{o1}$ will be such that they maintain this independence. In other
words, $D^{o1}$ in (30) should be such that the differences
$[\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})]$ are independent from each other for different $j\ell$ values.
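Observing expression (30), we see that due to the truth of the inequality

    $D^{o1}\Big( \frac{1}{p} \sum_{i=1}^{p} \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} [\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})] \Big| > \epsilon \Big)$

    $\le D^{o1}\Big( \max_{1 \le i \le p} \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} [\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})] \Big| > \epsilon \Big)$

    $\le \sum_{i=1}^{p} D^{o1}\Big( \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} [\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})] \Big| > \epsilon \Big)$

to have (29) satisfied, it is sufficient that:

    $D^{o1}\Big( \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} [\mu_{i\ell}(X_{k,j\ell}) - \mu_{i\ell}(Y_{k,j\ell})] \Big| > \epsilon \Big) < \frac{\epsilon}{p}$ ; $\forall i: 1 \le i \le p$    (31)

So, if for some $D^{o1}$ choice with $Q^o$, $Q^1$ marginals, (31) is satisfied, so
is (29). For additional simplification we will assume that we are working
on distribution spaces that generate stationary data. Then,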

- V - fftk.mr  - V ; T mr 

- X - - V ’ V ^ “ 

In this case the distances $d_{L_{\rho_o}}(Q^o,Q)$, $d_{L_\rho}(D^o(\hat{S}_n), D(\hat{S}_n))$ are generated
by the k-dimensional distributions $Q^o_k$, $Q_k$. Also, we can then define:

    $m^o_{i\ell} = E_{Q^o_k}\{\mu_{i\ell}(X_{k,j\ell})\}$

    $m^1_{i\ell} = E_{Q^1_k}\{\mu_{i\ell}(Y_{k,j\ell})\}$    (32)

where $m^o_{i\ell}$, $m^1_{i\ell}$ are independent of $j\ell$. From (32) and (31) we obtain that
if we want (29) satisfied, it is sufficient to require the following condition:

D°lf  1 “ jx  + 


where 


+ 2 a«[mt« 

je=i  1 


o 1 
m 


U 


] | > c}  < - ;Ti:l<i< 


(33) 


‘ “ mU  ’ "L  (34) 

Directly from (33) we can express the following stronger condition, to guarantee
satisfaction of (29):

    $D^{o1}\Big( \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} \{[\mu_{i\ell}(X_{k,j\ell}) - m^o_{i\ell}] - [\mu_{i\ell}(Y_{k,j\ell}) - m^1_{i\ell}]\} \Big| > \epsilon - \sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}| \Big) < \frac{\epsilon}{p}$ ; $\forall i: 1 \le i \le p$    (35)

At this point we find it necessary to summarize our analysis so far in
the following corollary:

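Corollary 1

If, given $\epsilon > 0$, there is some $n_a$ and some $\delta(\epsilon,n_a) > 0$ such that:
$\forall n \ge n_a$ and every $Q^1_k$ satisfying:

    $d_{L_{\rho_o}}(Q^o_k, Q^1_k) < \delta(\epsilon,n_a)$

there is some $D^{o1}_k$ with marginals $Q^o_k$, $Q^1_k$ implying:

    $D^{o1}_k\Big( \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} \{[\mu_{i\ell}(X_{k,j\ell}) - m^o_{i\ell}] - [\mu_{i\ell}(Y_{k,j\ell}) - m^1_{i\ell}]\} \Big| > \epsilon - \sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}| \Big) < \frac{\epsilon}{p}$ ; $\forall i: 1 \le i \le p$    (36)

where $\mu_{i\ell}(\cdot)$ is continuous on $E^k$ for $1 \le \ell \le q$; $1 \le i \le p$, then the
estimate described by (27) is Levy-Levy weakly robust according to definition 4.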

Now that we have summarized the observations so far, we will continue
with the analysis of the conditions of corollary 1.

Observing condition (36) we realize that if we want it satisfied for
given $\epsilon > 0$, the sum

    $\sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}|$    (37)

must be smaller than $\epsilon$. In other words, for given $\epsilon > 0$, $n_a$ we would
like to find some $\delta(\epsilon,n_a) > 0$ that first guarantees the satisfaction of
the inequality

    $\sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}| \le \epsilon_1$ , where $\epsilon_1 > 0$ and such that $\epsilon_1 < \epsilon$    (38)

for every $Q^1_k$: $d_{L_{\rho_o}}(Q^o_k, Q^1_k) < \delta(\epsilon,n_a)$ and then secures the existence of
some $D^{o1}_k$ with $Q^o_k$, $Q^1_k$ marginals that satisfies (36).

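Let us pick $\epsilon_1 = \epsilon/2$. Then, if $\mu_{i\ell}(\cdot)$ is bounded absolutely by
some constant $B$ for every $1 \le i \le p$, $1 \le \ell \le q$ and every $X_k$, we know from
lemma 3 and its proof in appendix A that given $\epsilon$ there is some

    $\delta_1\Big( \frac{\epsilon}{2 \sum_{\ell=1}^{q} |a_{i\ell}|}, k, B \Big) > 0$

such that, for every $Q^1_k$ within this Levy distance of $Q^o_k$,

    $\sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}| < \frac{\epsilon}{2}$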

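After this last observation, we can go one step further and express the
following corollary, which is a simplified form of corollary 1:

Corollary 2

Let the set of estimates $\{\mu_{i\ell}(\cdot)\}$ in (27) be continuous on $E^k$ and
absolutely bounded by some $B > 0$ for every $1 \le \ell \le q$, $1 \le i \le p$.

Let

    $\delta_1\Big( \frac{\epsilon}{2 \sum_{\ell=1}^{q} |a_{i\ell}|}, k, B \Big) > 0$

be a constant such that: if $Q^1_k$ is satisfying

    $d_{L_{\rho_o}}(Q^o_k, Q^1_k) < \delta_1\Big( \frac{\epsilon}{2 \sum_{\ell=1}^{q} |a_{i\ell}|}, k, B \Big)$

then

    $\sum_{\ell=1}^{q} |a_{i\ell}| |m^o_{i\ell} - m^1_{i\ell}| < \frac{\epsilon}{2}$

for some given $\epsilon > 0$. In this case, if for the same above given $\epsilon > 0$,
and for some natural number $n_a$, there is some $\delta(\epsilon,n_a) > 0$:

    $\delta(\epsilon,n_a) \le \delta_1\Big( \frac{\epsilon}{2 \sum_{\ell=1}^{q} |a_{i\ell}|}, k, B \Big)$

such that for every $n \ge n_a$ and $Q^1_k$ satisfying

    $d_{L_{\rho_o}}(Q^o_k, Q^1_k) < \delta(\epsilon,n_a)$

there is some $D^{o1}_k$ with $Q^o_k$, $Q^1_k$ marginals that satisfies the condition:

    $D^{o1}_k\Big( \Big| \frac{1}{n} \sum_{j=1}^{n} \sum_{\ell=1}^{q} a_{i\ell} \{[\mu_{i\ell}(X_{k,j\ell}) - m^o_{i\ell}] - [\mu_{i\ell}(Y_{k,j\ell}) - m^1_{i\ell}]\} \Big| > \frac{\epsilon}{2} \Big) < \frac{\epsilon}{p}$ ; $\forall i: 1 \le i \le p$    (39)

then the estimate in (27) is Levy-Levy weakly robust by definition 4.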


The boundedness condition on the estimates that appeared in section 4 is
included again in corollary 2. As mentioned before, this condition is a
realistic property that protects the estimate values from wandering in the
space of unacceptable values.

We will now concentrate our attention on expression (39), hoping to obtain
a comparatively small lower bound on the sample size $n$ that satisfies the
bound $\epsilon/p$. That, of course, we expect only for particular choices of estimates.

We point out again that the brackets

    $[\mu_{i\ell}(X_{k,j\ell}) - m^o_{i\ell}] - [\mu_{i\ell}(Y_{k,j\ell}) - m^1_{i\ell}]$    (40)

in (39) are zero mean independent variables for every different $j\ell$ value.

A theorem expressed by Revesz ([5], pg. 57) will be extremely useful here.
We state the theorem below.


Theorem 1

Let $x_1, x_2, \ldots, x_n$ be independent, zero mean, not necessarily identically
distributed variables. Then, the probability

    $P_n(\eta) = \Pr\Big( \Big| \frac{x_1 + x_2 + \ldots + x_n}{n} \Big| > \eta \Big)$

converges to zero exponentially for any $\eta > 0$, i.e., there is some $C > 0$
and some $0 < v < 1$ such that:

    $P_n(\eta) \le C v^n$

if and only if: For all $\eta > 0$ there exists a constant $g_\eta > 0$ and some
$t_\eta > 0$ such that:

    $\prod_{k=1}^{n} E(e^{t x_k}) \le g_\eta\, e^{|t| \eta n}$ whenever $|t| \le t_\eta$

Also, the probability $P_n(\eta)$ cannot converge faster than exponentially and
the constants $C$ and $v$ that express the bound on the probability $P_n(\eta)$
are chosen as follows: $C = g_\eta$ ; $v = e^{(\theta - \eta) t_\eta}$ ; where $\theta$ is some value in the
interval $(0, \eta)$.
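
As a numerical illustration of an exponential bound of the form $C v^n$, the sketch below uses the simplest case of independent, zero mean, identically distributed variables taking the values +1 and -1 with equal probability. It compares the exact tail probability with the standard Chernoff bound $2 (\cosh(t) e^{-t\eta})^n$. This is our own example and not the construction used by Revesz; the function names are hypothetical.

```python
import math


def exact_tail(n, eta):
    """Exact Pr(|x_1 + ... + x_n| / n > eta) for n fair +1/-1 variables."""
    count = 0
    for heads in range(n + 1):
        s = 2 * heads - n                 # value of the sum
        if abs(s) > eta * n:
            count += math.comb(n, heads)
    return count / 2 ** n


def chernoff_bound(n, eta, t):
    """Two-sided Chernoff bound 2 * v**n with v = E[e^{t x}] * e^{-t eta},
    where E[e^{t x}] = cosh(t) for a fair +1/-1 variable and t > 0."""
    v = math.cosh(t) * math.exp(-t * eta)
    return 2 * v ** n, v


eta = 0.2
t_best = math.atanh(eta)                  # minimizes cosh(t) * exp(-t * eta)
for n in (10, 50, 100, 200):
    bound, v = chernoff_bound(n, eta, t_best)
    print(n, exact_tail(n, eta), bound, v)
```

Here the constant multiplying $v^n$ is 2 and $v < 1$, so the bound decays geometrically in the number of samples.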
We will apply the above theorem to condition (39) to design robust estimates
that, in addition to satisfying the continuity and boundedness properties
expressed in corollary 2, also have the strong characteristic of the
fastest possible convergence to their asymptotic value, which through the
absolute boundedness is guaranteed to be stable inside stationary $Q^o$-Levy-contaminated
distribution families, whenever the contamination is small enough.

Directly from theorem 1, from the fact that the expressions in (40) are
independent and zero mean for different $j\ell$ values, and from the observation
that the sums

    $\sum_{\ell=1}^{q} a_{i\ell} \{[\mu_{i\ell}(X_{k,j\ell}) - m^o_{i\ell}] - [\mu_{i\ell}(Y_{k,j\ell}) - m^1_{i\ell}]\}$    (41)

are identically distributed for every $j$, we obtain that the left part of
inequality (39) converges exponentially if and only if:

For all $\epsilon > 0$ there exists some constant $g_{\epsilon/2} > 0$ and some $t_{\epsilon/2} > 0$
such that

    $\Big( \prod_{1 \le \ell \le q} E_{D^{o1}_k}\{ e^{t a_{i\ell} \{[\mu_{i\ell}(X_k) - m^o_{i\ell}] - [\mu_{i\ell}(Y_k) - m^1_{i\ell}]\}} \} \Big)^n \le g_{\epsilon/2}\, e^{|t| n \epsilon/2}$ ; $\forall |t| \le t_{\epsilon/2}$    (42)


Due to theorem 1, the larger the $t_{\epsilon/2}$ we can find, the faster the convergence
of the estimators to their asymptotic value. Also, we must emphasize
here that we are seeking a $t_{\epsilon/2}$ that is common for all $Q^1_k$ that are members
of the data distribution contaminated family.

As an additional observation on (42), we see that since its left part is
equal to one for $t = 0$, $g_{\epsilon/2}$ cannot be smaller than one. Seeking the
smallest possible $g_{\epsilon/2}$ we may as well pick $g_{\epsilon/2} = 1$.

In this case, condition (42) can take the form:

    $\sum_{\ell=1}^{q} \ln E_{D^{o1}_k}\{ e^{t a_{i\ell} \{[\mu_{i\ell}(X_k) - m^o_{i\ell}] - [\mu_{i\ell}(Y_k) - m^1_{i\ell}]\}} \} \le |t| \frac{\epsilon}{2}$ ; $\forall |t| \le t_{\epsilon/2}$    (43)

where $t_{\epsilon/2}$ is some positive value and $D^{o1}_k$ some 2k-dimensional distribution
with $Q^o_k$, $Q^1_k$ marginals. Each of the logarithmic expressions in (43) is a
convex U function of $t$ with minimum at $t = 0$ and minimum value equal
to zero. This is true due to the fact that each of these logarithmic functions
has positive second derivative for every $t$ and first derivative at $t = 0$
that is equal to zero. The above observations are true for every $Q^o_k$, $Q^1_k$,
$D^{o1}_k$ choice. Also, the sum of the logarithmic functions in (43) is
convex U with minimum that is equal to zero and happens at $t = 0$.

Due to the above observations, there is always some $t_{\epsilon/2}$ (for given
$\epsilon > 0$) that satisfies (43). The analysis should now concentrate on
designing the constants $a_{i\ell}$ and the functions $\mu_{i\ell}(\cdot)$ in a way that will make
the $t_{\epsilon/2}$ that is common for all distributions in the contaminated family as large
as possible.
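
To make the role of the largest admissible $t$ concrete, the following sketch finds, by bisection, the largest $t$ for which a convex function $\Lambda(t)$ with $\Lambda(0) = 0$ and $\Lambda'(0) = 0$ stays below the line $|t| \cdot$ slope on the whole interval $|s| \le t$. The function `largest_t`, the search limit `t_max`, and the symmetric two-point example are hypothetical choices made for illustration; the paper itself does not prescribe a numerical procedure.

```python
import math


def largest_t(log_mgf, slope, t_max=50.0, tol=1e-10):
    """Largest t in (0, t_max] with log_mgf(s) <= |s| * slope for all |s| <= t.

    Relies on log_mgf being convex with value 0 and derivative 0 at s = 0,
    so that on each side of zero the inequality holds on a single interval.
    """
    def one_side(sign):
        def gap(t):
            return log_mgf(sign * t) - slope * t
        if gap(t_max) <= 0:
            return t_max
        lo, hi = 0.0, t_max               # gap(lo) <= 0 < gap(hi)
        while hi - lo > tol:
            mid = 0.5 * (lo + hi)
            if gap(mid) <= 0:
                lo = mid
            else:
                hi = mid
        return lo
    return min(one_side(1.0), one_side(-1.0))


def log_mgf_two_point(t, a=1.0):
    """ln E[e^{t x}] for x taking the values +a and -a with equal probability."""
    return math.log(math.cosh(a * t))


t_star = largest_t(log_mgf_two_point, slope=0.1)
```

A flatter logarithmic moment generating function gives a larger `t_star`, which is the sense in which the design should make these functions slowly increasing around zero.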


Before we express some thoughts on that we will state the following theorem:

Theorem 2

Let the set of estimates $\{\mu_{i\ell}(\cdot)\}$ in (27) be continuous on $E^k$ and absolutely
bounded by some $B > 0$ for every $1 \le \ell \le q$; $1 \le i \le p$. Then, if for some
$\delta > 0$ such that it is not larger than the

    $\delta_1\Big( \frac{\epsilon}{2 \sum_{\ell=1}^{q} |a_{i\ell}|}, k, B \Big)$

included in corollary 2, a common $t_{\epsilon/2} > 0$ can be found for all the members
of the $Q^o$-contaminated family characterized by:

    $d_{L_{\rho_o}}(Q^o_k, Q^1_k) \le \delta$    (44)

where $t_{\epsilon/2}$ is such that it satisfies the expression

    $\sum_{\ell=1}^{q} \ln E_{Q^o_k}\{ e^{t a_{i\ell} [\mu_{i\ell}(X_k) - m^o_{i\ell}]} \}\, E_{Q^1_k}\{ e^{-t a_{i\ell} [\mu_{i\ell}(Y_k) - m^1_{i\ell}]} \} \le |t| \frac{\epsilon}{2}$    (45)

    $\forall |t| \le t_{\epsilon/2}$ ; $\forall Q^1_k$ satisfying (44)

the estimate in (27) is weakly robust and it converges to its asymptotic
value with rate expressed by the bound:

    $e^{n t_{\epsilon/2} (\theta - \epsilon/2)}$ ; $\theta \in (0, \epsilon/2)$    (46)

The absolute boundedness of $\mu_{i\ell}(\cdot)$ guarantees stability of the $\mu_{i\ell}(\cdot)$ asymptotic
value inside the contaminated family expressed by (44).

Also, for performance equal to $\epsilon$ (where $\epsilon > 0$) it is sufficient that
we choose the number of samples equal to:

    $n_a = \frac{\ln(\epsilon/p)}{(\theta - \epsilon/2)\, t_{\epsilon/2}}$    (47)

$\theta$ is some value in $(0, \epsilon/2)$ and it is the same in both expressions (46) and
(47).
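
The sample size in Theorem 2 can be evaluated numerically once $t_{\epsilon/2}$ is known. The sketch below assumes that the convergence bound has the exponential form $e^{n (\theta - \epsilon/2) t_{\epsilon/2}}$ suggested by theorem 1 with $C = 1$, and that the target probability is $\epsilon/p$; both are our reading of the text, and the function name `min_sample_size` is hypothetical.

```python
import math


def min_sample_size(eps, p, theta, t):
    """Smallest integer n with exp(n * (theta - eps/2) * t) <= eps / p.

    Assumes 0 < theta < eps/2 and t > 0, so the exponent is negative
    and the bound decreases with n.
    """
    if not (0 < theta < eps / 2):
        raise ValueError("theta must lie in (0, eps/2)")
    if t <= 0 or p < 1:
        raise ValueError("t must be positive and p at least 1")
    if eps / p >= 1:
        return 1                          # bound is trivially satisfied
    rate = (eps / 2 - theta) * t          # positive decay rate per sample
    return math.ceil(math.log(p / eps) / rate)


n_a = min_sample_size(eps=0.2, p=2, theta=0.05, t=0.2)
```

Choosing $\theta$ close to zero gives the fastest guaranteed decay for a fixed $t$, at the cost of a stricter condition on the estimate.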


In theorem 2 we picked $D^{o1}_k(X,Y) = Q^o_k(X) \cdot Q^1_k(Y)$. We will now concentrate
on expression (45) to draw some useful conclusions. Indeed we can separate
the expression (45) into two parts: one including the behavior of the estimate
at the central distribution and one describing the same behavior at
the distributions included in the contaminated family in (44).

Through the separation we just mentioned we can write (45) in the following
way:

    $\sum_{\ell=1}^{q} \ln E_{Q^o_k}\{ e^{t a_{i\ell} [\mu_{i\ell}(X_k) - m^o_{i\ell}]} \} + \sum_{\ell=1}^{q} \ln E_{Q^1_k}\{ e^{-t a_{i\ell} [\mu_{i\ell}(Y_k) - m^1_{i\ell}]} \} \le |t| \frac{\epsilon}{2}$ ; $\forall |t| \le t_{\epsilon/2}$    (48)

Both sums on the left part of (48) are convex U with minimum equal to zero
and assumed at $t = 0$. The less sharp both of these sums are as functions
of $t$, the larger $t_{\epsilon/2}$ will be for given $\epsilon > 0$. Since $Q^o_k$ is a
well-defined distribution, the set $\{a_{i\ell}, \mu_{i\ell}(\cdot) ; 1 \le \ell \le q, 1 \le i \le p\}$ can be
designed to make the sum

    $\sum_{\ell=1}^{q} \ln E_{Q^o_k}\{ e^{t a_{i\ell} [\mu_{i\ell}(X_k) - m^o_{i\ell}]} \}$    (49)

as little sharp at $t = 0$ as possible.

We will call an estimate $\{a_{i\ell}, \mu_{i\ell}(\cdot) ; 1 \le \ell \le q ; 1 \le i \le p\}$ in (27)
a fast estimate at $Q^o_k$ if it makes the expression in (49) a slowly increasing
function of $t$ around $t = 0$. The continuity of the functions $\mu_{i\ell}(\cdot)$ on
$E^k$ and their boundedness guarantee closeness of the moment generating
functions

    $E_{Q^1_k}\{ e^{-t a_{i\ell} [\mu_{i\ell}(Y_k) - m^1_{i\ell}]} \}$

to the central moment generating function

    $E_{Q^o_k}\{ e^{-t a_{i\ell} [\mu_{i\ell}(X_k) - m^o_{i\ell}]} \}$

for $Q^1_k$ belonging to the contaminated family (44). Therefore, for fast
exponential convergence, it is sufficient to design the estimator in (27)
to make the logarithmic expression of the moment generating functions at
$Q^o$ (expression (49)) as slowly increasing with $t$ around $t = 0$ as possible.
In conclusion, we finally express the following theorem:


Theorem 3

If the set of estimates in (27) is continuous as a function on $E^k$,
absolutely bounded for every $1 \le \ell \le q$, $1 \le i \le p$ and it is a fast estimate
at $Q^o_k$, then it is also weakly robust at $Q^o_k$, and it converges fast and
exponentially to its stable mean.

The advantage of the method presented in this section, in comparison to
Hampel's method, is that the design of the weakly robust estimates reduces to
real function properties and to moment generating functions at the central
distribution $Q^o$. Furthermore, this method has the tremendous advantage of
treating finite data samples (through the smallest $n_a$ that achieves the
required performance) and not just asymptotic situations as in [1]. Finally,
the Levy performance criterion is stronger than the small-variance criterion
treated by Huber [1].

Finding the smallest $t_{\epsilon/2}$ in theorem 2 that will satisfy all members
of the $Q^o$-contaminated family is a task, samples of which will be shown in
the following section.

Concluding this section, we will mention that theorem 1 has also been
applied in [19] to find confidence intervals for Bayes error estimation,
which were subsequently used to determine the optimal degree of quantization.

6. FURTHER DISCUSSION ON SECTIONS 3, 4 AND 5

In this section we will present some discussion on the application of
the theory that appears in the previous sections. Our discussion will be
mostly oriented toward expressing methods and suggesting ideas for the actual
design of the robust estimators under consideration. The possibilities for
such designs are many and lengthy and they will appear in detail in future work
(under preparation).

1. We will first start with the Vasershtein robust estimators that are
analyzed in section 3. According to lemma 2, such an estimator (where the
underlying distortion measure is the square error) must be linear. That is:

    $\hat{S}_n(X_n) = \{\hat{s}_{in}(X_n) ; i = 1, \ldots, p\}$ ; $\hat{s}_{in}(X_n) = \sum_{k=1}^{n} a_{ni}(k)\, x_k$    (50)

Also, a sufficient condition for robustness in this case is that each of the
sequences

    $\{ \sum_{k=1}^{n} |a_{ni}(k)| \}$ ; $i = 1, \ldots, p$

converges to some finite value. Any

    $\{ \sum_{k=1}^{n} |a_{ni}(k)| \}$

that is geometric with multiplying factor $\omega$ smaller than one belongs to the
above group. Also, the experimental mean is obviously Vasershtein robust.
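
As a small illustration of the absolute summability condition, the sketch below builds geometric coefficients with multiplying factor $\omega < 1$ and applies them as a linear estimate. The normalization by $(1 - \omega)$, which makes the coefficient sum approach one, and the assignment of the largest weight to the first sample are our own illustrative choices; the function names are hypothetical.

```python
def geometric_weights(n, omega):
    """Coefficients (1 - omega) * omega**k, k = 0..n-1, with 0 < omega < 1.

    Their absolute sum equals 1 - omega**n, which converges to 1 as n grows.
    """
    if not 0 < omega < 1:
        raise ValueError("omega must lie strictly between 0 and 1")
    return [(1 - omega) * omega ** k for k in range(n)]


def linear_estimate(data, weights):
    """Linear estimate: weighted sum of the data samples."""
    if len(data) != len(weights):
        raise ValueError("data and weights must have the same length")
    return sum(a * x for a, x in zip(weights, data))


weights = geometric_weights(60, 0.5)
partial_sums = []
running = 0.0
for a in weights:
    running += abs(a)
    partial_sums.append(running)          # bounded, nondecreasing sequence

estimate = linear_estimate([3.0] * 60, weights)
```

The partial sums of the absolute coefficients are nondecreasing and bounded, so they converge, which is the sufficient condition stated above.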

In general, one will look for linear estimates that converge at the central,
well-known distribution to the desired value and whose coefficients
satisfy the absolute convergence property mentioned above. For example, let
the central distribution $Q^o$ be m-dependent. Then accumulate the data in
m-groups and form the estimate


    $\hat{s}_{i,nm}(X_{nm}) = \frac{1}{n} \sum_{k=1}^{nm} a^*_{nm,i}(k)\, x_k$    (51)

where the previous coefficients $a_{ni}(k)$ in (50) are related to the $a^*_{nm,i}(k)$
in (51) by

    $a_{nm,i}(k) = \frac{1}{n} a^*_{nm,i}(k)$    (52)


Due to the m-dependence of $Q^o$, the sums

    $\frac{1}{n} \sum_{k=jm}^{(j+1)m} a^*_{nm,i}(k)\, x_k$

are independent from each other and according to theorem 1, the estimate in
(51) converges exponentially and in probability to

    $m_o \cdot \lim_{n \to \infty} \frac{1}{n} \sum_{k=1}^{nm} a^*_{nm,i}(k) = m_o \cdot \lim_{n \to \infty} \sum_{k=1}^{n} a_{ni}(k)$    (53)

where $m_o$ is the one-dimensional mean determined by $Q^o$.

If we pick all the coefficients positive and design the sum $\sum_{k=1}^{n} a_{ni}(k)$
so that it converges to the desired value at $Q^o$, then at the same time the
robustness sufficient condition is also satisfied and the estimate in (51) is
stable inside the Vasershtein contaminated family.


2. For the Levy-Vasershtein robustness (as found in section 4), one should
seek estimates that are continuous as real functions, continuous at the central
distribution, and also absolutely bounded. This last property is the
only one additional to those required of estimates that are robust in the
Hampel sense.

Of course, all estimates that are not Hampel robust (such as the experimental
mean) are not robust in the present sense either.

Also,  the  L-estimators 
Finally, using the approach of the M-estimators, we can find the estimate
$\hat{s}_{in}(X_n)$ as the truncated zeros of a sum [18]:

    $\sum_{j=1}^{n} \psi_i(x_j - \hat{s}_{in}(X_n))$    (55)

where $\psi_i(\cdot)$ is a smooth enough function. The zeros of the expression in (55)
will be passed through a nonlinearity such as the one mentioned above, or any other
that cuts off all estimate values that are absolutely higher than B.
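
A truncated M-estimate of this kind can be sketched as follows. The sketch assumes a nondecreasing, odd score function, so that the sum in (55) is nonincreasing in the estimate and a zero can be located by bisection between the smallest and largest sample; the clipped identity used as the score function and the names `truncated_m_estimate` and `clipped_identity` are hypothetical choices for illustration only.

```python
def truncated_m_estimate(data, psi, bound, tol=1e-10):
    """Zero of sum_j psi(x_j - s) found by bisection, then cut off at +/- bound.

    Assumes psi is nondecreasing and odd, so the score sum is nonincreasing
    in s, nonnegative at s = min(data) and nonpositive at s = max(data).
    """
    def score(s):
        return sum(psi(x - s) for x in data)

    lo, hi = min(data), max(data)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    zero = 0.5 * (lo + hi)
    return max(-bound, min(bound, zero))  # truncation nonlinearity


def clipped_identity(u, c=1.0):
    """Example score function: identity clipped to [-c, c]."""
    return max(-c, min(c, u))


sample = [0.0, 0.1, -0.1, 100.0]          # three clean points and one outlier
s_wide = truncated_m_estimate(sample, clipped_identity, bound=5.0)
s_tight = truncated_m_estimate(sample, clipped_identity, bound=0.2)
```

In this example the outlier contributes at most one unit to the score sum, so the zero stays near the clean points, and the final cut-off keeps the estimate absolutely bounded by B.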


3. The fast robustness introduced in section 5 is important and deserves
special attention. Indeed, the method introduced there can be considered a
small sample method, since the effort is toward designing estimates that, besides
being robust, converge fast to their mean, which is stable for small deviations
of the data distribution from the central one.

The measure of performance is Levy distance. In other words, if a specific
contaminated family of data distributions is given, the stability of the
estimate in it is measured through the longest Levy distance of the estimate
distribution at some $Q$ from the estimate distribution at $Q^o$, where $Q^o$ is
the central well-known data distribution and $Q$ moves inside the contaminated
family.

As explained in section 5, if p scalar estimates are calculated from
the same data and the performance required is a fixed $\epsilon > 0$, then the minimum
number of data $n_a$ that will satisfy this performance is given by


n 

a 


£n  — 

P 

(£  ~ 2^/4 


(56) 


where $\theta$ is some arbitrary constant in $(0, \epsilon/2)$ and $t_{Q_k,\epsilon/4}$ is, for given
estimates (of (27) form and k-dependent group data), the smallest $t_{Q_k,\epsilon/4}$
such that
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E In  E , le  J < |t  \g  i V |t  | < t . . .. 

w nj  “ Qi.«/4 


(57) 


where $Q_k$ moves in the considered data contaminated family and the smallest
$t_{Q_k,\epsilon/4}$ is taken among these $Q_k$ members.

To make this minimum $t_{Q_k,\epsilon/4}$ as large as possible, we go backward.
Specifically, we are looking for the estimate set $\{a_{i\ell} ; \mu_{i\ell}(X_k) ; 1 \le \ell \le q ;
1 \le i \le p\}$ that achieves the largest $t_{Q_k,\epsilon/4}$ for the worst $Q_k$ choice
inside the specific given contaminated family. This worst case corresponds to
the function $\sum_{\ell=1}^{q} \ln E_{Q_k}\{\cdot\}$ in (57) with the fastest increase for $|t|$ increasing.

Specific continuous and absolutely bounded functions $\mu_{i\ell}(\cdot)$ can be
chosen with some of their characteristics left flexible. The design of these
characteristics becomes a maximum problem (find the largest $t_{Q_k,\epsilon/4}$ for the
logarithmic moment generating expression that is the sharpest in the family) and the
methods for solving it are similar to the ones applied in [1] and [19].
APPENDIX
Proof of Lemma 3


In the distance $d_V(D^o(\hat{S}_n), D(\hat{S}_n))$ in (25), the terms

    $(\sqrt{\sigma^o_{in}(0)} - \sqrt{\sigma_{in}(0)})^2$

and $(m^o_{in} - m_{in})^2$ appear. We develop upper bounds on each of them. Indeed,
we obtain:


</?„<<»  - y^m2  < l/°n<o)  - l/°n(o)  + l 

^2  \ />  / ««>  \ 1*^2 


J ! U 1 f,  5 17  V 


Kn<°>  - "in'W  l - IKn<  ‘ JV<V«dV 


- I <“?/  - I 

' " ■■"•"<  l^n«n>V^>  "■;*?»•<*»)«<«.)  I ♦ Kn>2  - <«ta)2|  <*•« 

If the estimate $\hat{s}_{in}(\cdot)$ is absolutely bounded from above by some $A_i > 0$,
then we obtain from (A.1):

- /s^2  < J|Sln(xn)  + 5ln(Yn)||stn(Xn)  - S.nCYn)l  • 


• + 2AiHSt»<V  - Sln(Yn)  |D(dXn,dYn) 

S4AJISin<V  - Si„<V  lD<dXn>dV 


where $D(X_n,Y_n)$ is some distribution with $Q^o(X_n)$, $Q(Y_n)$ marginals.    (A.2)

From (A.2) we also obtain, for the same $D(X_n,Y_n)$:


1U  li 


(Jo°.  (0)  - /arcs'))2  + (m?  - m.  )2 

v in'  in'  ' 'in  in' 

< 6A. Ms.  (X  ) - s.  (Y  ) |D(dX  ,dY  ) 
^ ^ iJ  1 in'  n in'  n 1 ' n n 


For robustness from $d_L(Q^o,Q)$ to $d_V(D^o(\hat{S}_n),D(\hat{S}_n))$ we require
that, given $\epsilon > 0$, there is some $\delta(\epsilon) > 0$ such that

    $d_L(Q^o,Q) < \delta(\epsilon) \Rightarrow d_V(D^o(\hat{S}_n),D(\hat{S}_n)) < \epsilon$ ; $\forall n$
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Given  e > 0 , there  is  some  6(6)  > 0 : 

For  dL(Qo,Q)  < 6(e) 

• max  (6A.J*  |s  (X  ) - s.  (Y  ) |D(dX  ,dY  ))  < e ; Vn 
i"  in  n in  n n n 

where  D(xn>Yn)  1:138  respective  marginals  Q°(Xn)  and  Qn(Yn)  • given 
6 > 0 , there  is  for  every  1 < i < p some  6 (e)  > 0 : 

IISin<V  ' Sln<V  lD(dVdYn>  < 677  ' T " 

Then 

6(6)  = min  6 (6) 
l<i<p  1 

If  dL(<^n,Qn)  < 6irr(e)  » ther>  these  JU  80133  Y n D<xn»Yn).  wit*  . Qn<V  *Q(V 

marginals  (Strassen  [2],  Th.  11)  such  that 

D(X  ,Y  :p(X  ,Y  ) > 6.  (c) 
n n n n - in 

Pick  this  D(X  ,Y  ) and  write: 
n n 

Ms.  (X  ) - s.  (Y  ) )D(dX  ,dY  ) 

1 in  n in'  n 1 n n 

= T|s.  (Y  ) -8.  (X  )jD(dX  ,dY  ) + Ms,  (Y  ) - s,  (X  ) |D(dX  ,dY  ) 
•J  1 in'  n in'  n 1 n n J 1 in  n in  n 1 n n 

VV^VV^in^  VV^VV^i/') 

""  i2VinW+JISiA)  ’ lD<dVdV 

X ,Y  : p(X  ,Y  ) < 6 (e) 

n’  n n’  n'  in' 

since  |s  (X  ) I < A.  ; V n 
1 in  n 1 — i 


Let $\hat{s}_{in}(X_n)$ be a continuous function everywhere on $E^n$. Then, given
$\epsilon_i > 0$ and $X_n$, there is some $\delta(\epsilon_i, X_n) > 0$ such that

v Yn:p(Xn,Yn)  < S(.t.Xn)  - |Sln(Y  ) - Sln(Xn)  I < . 


: 5 : i-  ’ j / jar 
Choose a sequence $\{c_j\}$ of positive components, monotonically decreasing toward
zero, and define


Then 


Define 


il  ll 


A.  = (x  :fl(e.,X  ) > cj 
j n -J  » _ j 


Aj  £ A - e"  ; uAj  - En 


U « ) - (Y  :p(X  ,Y  ) < J(«  ,X  )) 
n n n n in 


«<VV 


V° 


U (X  ) :X  € A. 

•<w  ° n 3 


Then  B,  c B j , UB^  - E 

Now,  given  TL  > 0 , there  is  some  k,  (n,T|  ) such  that 

v *•  1 1 1 . * , - • , 

\ 


UB 


j 


> 1 - TL 


Denote:  £,  = UB. 

in  J 


Pick 


^US^Cn.Vj 

UB. 


1 18Ai  ’ 11  36Af 


«i  («)  = mink  (n»~ V)*— V 

1 K1  36A^  36A^ 


Then 
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J'lSin(V  * SlnVn>  lD(dVdV 

< 2*l«lB(.)  + ;|JlBOrn)  - yx.iliKffl,®) 


X €£ 
n in 

Y : p(X  ,Y  )<S.  (e) 
n r'  n’  n in' 


+ Jl**in"n5  * 


X €£? 
n in 

Y : p(X  ,Y  )<fi,  (e) 
n n*  n7  in'  7 


II!  I ! 


IWtal"  t,  2Al\  + Hv<V  - *ln<V 


x €6 

n in 

V<W  < ‘l.'" 


< 2A1Sln(e)  + 2A1111  + «. 


< 2A.  + 2Aa  — V + 18A~  “ fiT" 

1 36At  1 36A?  18Ai  6Ai 


For  n < n , we  can  pick 


6in  ^e)  = min^ — T » min  Clc  <n* — 

o 3 6 AT  n<n  Ki  36AT 

i—o  i 


to  satisfy 


dT  (Q  ,Q  ) < ft.  (e)  = 6.  (e) 

L on  n in'  7 in  ' 7 
• o 


IiSin<Xi„>  - Sln<V  lD<“n>dV  < 6^  ; T n * “o 


Proof  of  Lemma  4 


at-acu  ^ jt»  vaJ 


To  take  care  of  n > nQ  , let  dL(Q  ,Q)  < 6n(e)  for  n > nQ  , where 

c f rO  v»isv*given» v,  Pid&vagaLn^  B(Xi,Yw^  with  Q .Q  marginals  such  that 
i n n on  n 

D<Xn-Yn:P(Xn-V  * W 2 W 


and  write 
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..  KV-%' 


.TISln<Xn>  ‘ Sln<V  lD<'JVdYn>  < 2Wt>  + 

+ S lSi„<V  - W lD<“„'dYn>  in>\ 

y.^v-xw 

For arbitrary $Q^o(\cdot)$ there is nothing that can be done at this point. So,
let $Q^o(\cdot)$ be m-dependent. That is, if $x_1, x_2, \ldots, x_n$ is distributed
according to $Q^o(\cdot)$, then every $x_i$ depends on only m preceding and
following data.

Then, let $X_n$ consist of $X^k_i$ ; $i = 1, 2, \ldots$ vectors of consecutive data
that have m data gaps between $X^k_i$ and $X^k_{i+1}$. Then, $X^k_i$, $X^k_j$ ; $i \ne j$
are independent by $Q^o(\cdot)$. Let $n = k n_1$ and define the following experimental
distribution:


y i\  a 


k 1 

n£  = — {#  of  2^‘s  with  x1  < u^,  x^<  < u^} 

n 1 

The  typical  sequences  Xq  are  such  that  n£  (u^,...,^)  approaches  q£(X.) 

n 

for  n large  enough. 

Or,  given  T).^  > 0 , there  is  some  nQ  and  some  6(7]^)  > 0 such  that 

.0  k 


Let 


Then 


V n > nQ  =>  Q°{dL  <Q°,n£  ) > 6(^)1  < T\ 
P n 


Snl  ’ (Xn:dL  > < ‘<V] 

P « 


Q (e  .)  > 1 - 1).  ; V n > n 

ni  i o 


Going  back  on  page  18  we  have  now 


H'W  - Stn<V  lD<dV®n>  5 2*lW 


+ Ms.  (x  ) - s.  (Y  )|D(dX  ,dY  ) + p|s  ■ (X  > - s.  (Y  )|D(dX  ,dY  ) 
J 1 in  n in  n 1 n n J 1 in  n in  n ' n*  n 


X €e  . 

* ni 

Y : p(X  ,Y  )<6  (e.) 
n n*  n7  nv  i7 


X €ec, 
n ni 


Y : p(X  ,Y  )<6  (e.) 
n n’  n7  nv  i 


< 2Al«n(.l)  + 2*^  + ;|Sln(Xn)  - Jln(Y n)  |D(dXn,dYn) 


x €e  . 

n ni 

Y : p(X  ,Y  ) < & (e  ) 
n n*  n7  nv  i7 


Let 


k k 


" ' * p(X^;Y^>  < ftn('«£)  (n^  iny  ) < 5 V!  n > n ! - v ™ «•« 

p n n . ° 

where  depends  on  the  measure  p(*,*)  . Then, 

Xlsin<V  - slnan)  Ndxn,dYa) 

< 2hW  + 2AA  + X |SlnCV  - Sln(Vn)  |D(dXn,dTn) 

v - ■ -■  - *4  )<£„(«(«!))  r 

p n n v 

d^jQ*,*)  satisfying  the  triangle  inequality  and,  being  symmetric,  we  have: 


o k 


p n-- 

Soi^fronr  above  we  have 


k k 

jJ 


p — : n n 


ti  ri 


+ Xl!in«n>  * 

' VVdL  «£«’x  ><8<V 

p - n • 

dL  (Q^  t<6(TliT+  C (6(e ' 
p n p 

Let — s.  (X-)  be  continuous  at  -Q  . That  is,  given  /3->=0  , there  is 
in  n ok  i 

some  n(®^/3)  > 0 such  that 


r\  ; r*. 
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T X„:dL  <4  >4>  < l*<V3> 

p n 

“ - °i«£>  I < V3 


where  s . (Q.  ) the  value  that  s . (X  ) converges  for  n -*  ® . 
i k in  n 


Choose: 

T) 


1 6Ai  3 6 A?  ’ 1 6A1 


6 (Tip  = minCSC^ 


6(6^  = minCCp1 


,)>  ^(18A  )) 


3 6 A, 


) 


18A.J/*  ,,.2 
l 36A^ 


nv  s'  \ w; 


Then 


in  » a 


2*^)-*^  + J|Sln<V  - St„<V  lD<'JVV 


W^L  <4*4  > < *<\> 

P n 

dL  <\°4  > < ft<V  + Cp(fi(6i>) 
p n ^ 


< — 1_  + — 5L  + — * !_ 

- 18A1  WAj^  18At  BAj^ 


Finally,  pick  for  every  i 


fi(«)  ■ min  5(t  ) 
i 


mln(-^>  • "i"  V(n*  77?'  C^ajh*? 


X8aT  n<n  ^i  48A* 

l-o  i 


and  then  idv(D°(§n)  ,D(§n))  < e 


dL  (Q  ,Q)  < fl(«)  . 


for 


